Overview

Dataset statistics

Number of variables: 30
Number of observations: 227628
Missing cells: 941504
Missing cells (%): 13.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 47.5 MiB
Average record size in memory: 219.0 B

Variable types

CAT: 18
NUM: 8
BOOL: 4

Warnings

country has constant value "UAE" (227628 rows) Constant
months_per_bill_period has constant value "4" (227628 rows) Constant
language has constant value "ar" (227628 rows) Constant
account_creation_date has a high cardinality: 226023 distinct values High cardinality
trial_end_date has a high cardinality: 226023 distinct values High cardinality
last_payment has a high cardinality: 131339 distinct values High cardinality
next_payment has a high cardinality: 129206 distinct values High cardinality
cancel_date has a high cardinality: 284 distinct values High cardinality
discount_price is highly correlated with monthly_price and 1 other field High correlation
monthly_price is highly correlated with discount_price and 1 other field High correlation
num_trial_days is highly correlated with monthly_price and 1 other field High correlation
num_trial_days is highly correlated with plan_type High correlation
plan_type is highly correlated with num_trial_days High correlation
package_type has 35574 (15.6%) missing values Missing
num_weekly_services_utilized has 110450 (48.5%) missing values Missing
preferred_genre has 36326 (16.0%) missing values Missing
intended_use has 3549 (1.6%) missing values Missing
weekly_consumption_hour has 37930 (16.7%) missing values Missing
num_ideal_streaming_services has 112170 (49.3%) missing values Missing
age has 35169 (15.5%) missing values Missing
attribution_survey has 2644 (1.2%) missing values Missing
op_sys has 13375 (5.9%) missing values Missing
join_fee has 34904 (15.3%) missing values Missing
payment_type has 135578 (59.6%) missing values Missing
last_payment has 95391 (41.9%) missing values Missing
next_payment has 97378 (42.8%) missing values Missing
cancel_date has 190797 (83.8%) missing values Missing
age is highly skewed (γ1 = 403.3291721) Skewed
monthly_price is highly skewed (γ1 = -35.47057095) Skewed
discount_price is highly skewed (γ1 = -34.15730579) Skewed
account_creation_date is uniformly distributed Uniform
trial_end_date is uniformly distributed Uniform
last_payment is uniformly distributed Uniform
next_payment is uniformly distributed Uniform
subid has unique values Unique
join_fee has 33482 (14.7%) zeros Zeros
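
The γ1 figures in the Skewed warnings are skewness coefficients. The following is a minimal sketch, assuming the plain moment-based definition g1 = m3 / m2^(3/2); the report generator may apply a sample-size correction, so its values can differ slightly. The toy ages are invented for illustration and only show how a single implausible entry, like the age maximum reported here, can dominate the coefficient.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical toy data, not taken from the dataset.
clean_ages = [24, 35, 46, 46, 57, 70]
with_outlier = clean_ages + [81720000]  # one implausible age value

print(skewness(clean_ages))    # small: roughly symmetric data
print(skewness(with_outlier))  # much larger: the outlier dominates
```

A large γ1 on a column such as age is therefore more likely a data-quality signal than a property of the underlying population.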

Reproduction

Analysis started: 2020-12-04 13:26:07.485832
Analysis finished: 2020-12-04 13:26:39.172769
Duration: 31.69 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

subid
Real number (ℝ≥0)

UNIQUE

Distinct: 227628
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24986239.81
Minimum: 20000009
Maximum: 29999982
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 20000009
5-th percentile: 20496214.8
Q1: 22489975.5
median: 24970107.5
Q3: 27490204.75
95-th percentile: 29492191.3
Maximum: 29999982
Range: 9999973
Interquartile range (IQR): 5000229.25

Descriptive statistics

Standard deviation: 2885543.308
Coefficient of variation (CV): 0.1154852963
Kurtosis: -1.199613914
Mean: 24986239.81
Median Absolute Deviation (MAD): 2500579
Skewness: 0.007929852742
Sum: 5.687567796e+12
Variance: 8.32636018e+12
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value, Count, Frequency (%)
24643583, 1, < 0.1%
29903967, 1, < 0.1%
22623344, 1, < 0.1%
28429423, 1, < 0.1%
21559392, 1, < 0.1%
22670443, 1, < 0.1%
20053097, 1, < 0.1%
26866792, 1, < 0.1%
27366500, 1, < 0.1%
25991042, 1, < 0.1%
Other values (227618), 227618, > 99.9%

Minimum 5 values

Value, Count, Frequency (%)
20000009, 1, < 0.1%
20000048, 1, < 0.1%
20000062, 1, < 0.1%
20000094, 1, < 0.1%
20000104, 1, < 0.1%

Maximum 5 values

Value, Count, Frequency (%)
29999982, 1, < 0.1%
29999945, 1, < 0.1%
29999904, 1, < 0.1%
29999889, 1, < 0.1%
29999879, 1, < 0.1%
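
Several of the subid statistics are derived from others in the same block. The sketch below recomputes the range, interquartile range, coefficient of variation and variance from the figures printed in the report; it assumes the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared).

```python
# Figures copied from the subid statistics in this report.
minimum, maximum = 20000009, 29999982
q1, q3 = 22489975.5, 27490204.75
mean, std = 24986239.81, 2885543.308

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 6), f"{variance:.6e}")
```

The same checks apply to the other numeric variables and are a quick way to confirm that a flattened statistics block was read correctly.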
 

package_type
Categorical

MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 35574
Missing (%): 15.6%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
base, 111464, 49.0%
enhanced, 63241, 27.8%
economy, 17349, 7.6%
(Missing), 35574, 15.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 4
Mean length: 5.183672483
Min length: 3

num_weekly_services_utilized
Real number (ℝ≥0)

MISSING

Distinct: 12
Distinct (%): < 0.1%
Missing: 110450
Missing (%): 48.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.008824182
Minimum: 0
Maximum: 14
Zeros: 2
Zeros (%): < 0.1%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 2
median: 3
Q3: 3
95-th percentile: 5
Maximum: 14
Range: 14
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8205257244
Coefficient of variation (CV): 0.2727064377
Kurtosis: 2.350893503
Mean: 3.008824182
Median Absolute Deviation (MAD): 0
Skewness: 1.022780299
Sum: 352568
Variance: 0.6732624644
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)

Common values

Value, Count, Frequency (%)
3, 63944, 28.1%
2, 29773, 13.1%
4, 17323, 7.6%
5, 5033, 2.2%
6, 904, 0.4%
7, 138, 0.1%
1, 24, < 0.1%
8, 22, < 0.1%
9, 11, < 0.1%
10, 3, < 0.1%
Other values (2), 3, < 0.1%
(Missing), 110450, 48.5%

Minimum 5 values

Value, Count, Frequency (%)
0, 2, < 0.1%
1, 24, < 0.1%
2, 29773, 13.1%
3, 63944, 28.1%
4, 17323, 7.6%

Maximum 5 values

Value, Count, Frequency (%)
14, 1, < 0.1%
10, 3, < 0.1%
9, 11, < 0.1%
8, 22, < 0.1%
7, 138, 0.1%
 

preferred_genre
Categorical

MISSING

Distinct: 5
Distinct (%): < 0.1%
Missing: 36326
Missing (%): 16.0%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
comedy, 125129, 55.0%
drama, 46872, 20.6%
regional, 8990, 3.9%
international, 6404, 2.8%
other, 3907, 1.7%
(Missing), 36326, 16.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 13
Median length: 6
Mean length: 5.574090182
Min length: 3

intended_use
Categorical

MISSING

Distinct: 7
Distinct (%): < 0.1%
Missing: 3549
Missing (%): 1.6%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
access to exclusive content, 89039, 39.1%
replace OTT, 69185, 30.4%
supplement OTT, 26603, 11.7%
expand regional access, 14025, 6.2%
expand international access, 12978, 5.7%
other, 7112, 3.1%
education, 5137, 2.3%
(Missing), 3549, 1.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 27
Median length: 22
Mean length: 18.84182087
Min length: 3

weekly_consumption_hour
Real number (ℝ)

MISSING

Distinct: 81
Distinct (%): < 0.1%
Missing: 37930
Missing (%): 16.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.99772411
Minimum: -32.1467596
Maximum: 76.59996225
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: -32.1467596
5-th percentile: 21.50162318
Q1: 24.40153577
median: 27.30144835
Q3: 30.20136093
95-th percentile: 37.45114239
Maximum: 76.59996225
Range: 108.7467219
Interquartile range (IQR): 5.799825165

Descriptive statistics

Standard deviation: 4.976340586
Coefficient of variation (CV): 0.1777408966
Kurtosis: 3.166855179
Mean: 27.99772411
Median Absolute Deviation (MAD): 2.899912583
Skewness: 0.6415334006
Sum: 5311112.268
Variance: 24.76396563
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value, Count, Frequency (%)
25.85149206, 25990, 11.4%
28.75140464, 22947, 10.1%
27.30144835, 22551, 9.9%
24.40153577, 19968, 8.8%
30.20136093, 18366, 8.1%
22.95157947, 16839, 7.4%
31.65131722, 13262, 5.8%
33.10127351, 9411, 4.1%
21.50162318, 8720, 3.8%
34.5512298, 7424, 3.3%
Other values (71), 24220, 10.6%
(Missing), 37930, 16.7%

Minimum 5 values

Value, Count, Frequency (%)
-32.1467596, 1, < 0.1%
-29.24684701, 4, < 0.1%
-27.79689072, 1, < 0.1%
-23.44702185, 1, < 0.1%
-13.29732781, 1, < 0.1%

Maximum 5 values

Value, Count, Frequency (%)
76.59996225, 3, < 0.1%
75.15000596, 3, < 0.1%
73.70004967, 1, < 0.1%
72.25009338, 5, < 0.1%
67.90022451, 5, < 0.1%
 

num_ideal_streaming_services
Real number (ℝ)

MISSING

Distinct: 8
Distinct (%): < 0.1%
Missing: 112170
Missing (%): 49.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.061260372
Minimum: -1
Maximum: 7
Zeros: 4
Zeros (%): < 0.1%
Memory size: 1.7 MiB

Quantile statistics

Minimum: -1
5-th percentile: 2
Q1: 2
median: 2
Q3: 2
95-th percentile: 3
Maximum: 7
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2459068491
Coefficient of variation (CV): 0.1192992659
Kurtosis: 14.18046344
Mean: 2.061260372
Median Absolute Deviation (MAD): 0
Skewness: 3.601631299
Sum: 237989
Variance: 0.06047017843
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=8)

Common values

Value, Count, Frequency (%)
2, 108245, 47.6%
3, 7072, 3.1%
1, 89, < 0.1%
4, 42, < 0.1%
5, 4, < 0.1%
0, 4, < 0.1%
7, 1, < 0.1%
-1, 1, < 0.1%
(Missing), 112170, 49.3%

Minimum 5 values

Value, Count, Frequency (%)
-1, 1, < 0.1%
0, 4, < 0.1%
1, 89, < 0.1%
2, 108245, 47.6%
3, 7072, 3.1%

Maximum 5 values

Value, Count, Frequency (%)
7, 1, < 0.1%
5, 4, < 0.1%
4, 42, < 0.1%
3, 7072, 3.1%
2, 108245, 47.6%
 

age
Real number (ℝ≥0)

MISSING
SKEWED

Distinct: 278
Distinct (%): 0.1%
Missing: 35169
Missing (%): 15.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 757.9754574
Minimum: 0
Maximum: 81720000
Zeros: 64
Zeros (%): < 0.1%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 24
Q1: 35
median: 46
Q3: 57
95-th percentile: 70
Maximum: 81720000
Range: 81720000
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 192020.4428
Coefficient of variation (CV): 253.3333248
Kurtosis: 170584.1771
Mean: 757.9754574
Median Absolute Deviation (MAD): 11
Skewness: 403.3291721
Sum: 145879198.6
Variance: 3.687185044e+10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value, Count, Frequency (%)
50, 6355, 2.8%
40, 5509, 2.4%
60, 5221, 2.3%
48, 4651, 2.0%
47, 4517, 2.0%
55, 4509, 2.0%
49, 4483, 2.0%
52, 4440, 2.0%
43, 4357, 1.9%
42, 4346, 1.9%
Other values (268), 144071, 63.3%
(Missing), 35169, 15.5%

Minimum 5 values

Value, Count, Frequency (%)
0, 64, < 0.1%
10, 3, < 0.1%
16, 1, < 0.1%
18, 1097, 0.5%
19, 632, 0.3%

Maximum 5 values

Value, Count, Frequency (%)
81720000, 1, < 0.1%
11022000, 1, < 0.1%
10301949, 1, < 0.1%
8121930, 1, < 0.1%
8061990, 1, < 0.1%
 

male_TF
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 269
Missing (%): 0.1%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
False, 200902, 88.3%
True, 26457, 11.6%
(Missing), 269, 0.1%
 

country
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
UAE, 227628, 100.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

attribution_technical
Categorical

Distinct: 33
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
facebook, 80251, 35.3%
email, 25690, 11.3%
search, 25306, 11.1%
organic, 22013, 9.7%
brand sem intent google, 18524, 8.1%
google_organic, 10691, 4.7%
affiliate, 9894, 4.3%
email_blast, 7277, 3.2%
pinterest, 6065, 2.7%
referral, 5170, 2.3%
Other values (23), 16747, 7.4%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 23
Median length: 8
Mean length: 9.204082977
Min length: 2

attribution_survey
Categorical

MISSING

Distinct: 16
Distinct (%): < 0.1%
Missing: 2644
Missing (%): 1.2%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
facebook, 119126, 52.3%
tv, 39904, 17.5%
referral, 20882, 9.2%
search, 8492, 3.7%
pinterest, 7856, 3.5%
other, 6496, 2.9%
public_radio, 6219, 2.7%
social_organic, 3869, 1.7%
youtube, 3108, 1.4%
podcast, 2995, 1.3%
Other values (6), 6037, 2.7%
(Missing), 2644, 1.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 16
Median length: 8
Mean length: 6.965918077
Min length: 2

op_sys
Categorical

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 13375
Missing (%): 5.9%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
iOS, 143921, 63.2%
Android, 70332, 30.9%
(Missing), 13375, 5.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 7
Median length: 3
Mean length: 4.235911224
Min length: 3

months_per_bill_period
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
4, 227628, 100.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

plan_type
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value, Count
base_uae_14_day_trial, 227096
high_uae_14_day_trial, 325
low_uae_no_trial, 167
base_eur_14_day_trial, 18
high_sar_14_day_trial, 12
Other values (6), 10

Value, Count, Frequency (%)
base_uae_14_day_trial, 227096, 99.8%
high_uae_14_day_trial, 325, 0.1%
low_uae_no_trial, 167, 0.1%
base_eur_14_day_trial, 18, < 0.1%
high_sar_14_day_trial, 12, < 0.1%
low_gbp_14_day_trial, 4, < 0.1%
high_aud_14_day_trial, 2, < 0.1%
high_jpy_14_day_trial, 1, < 0.1%
low_sar_no_trial, 1, < 0.1%
low_eur_no_trial, 1, < 0.1%

Frequencies of value counts

Unique

Unique: 4
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 33
Median length: 21
Mean length: 20.99632295
Min length: 16

monthly_price
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.732049419
Minimum: 0.8074
Maximum: 5.1013
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0.8074
5-th percentile: 4.7343
Q1: 4.7343
median: 4.7343
Q3: 4.7343
95-th percentile: 4.7343
Maximum: 5.1013
Range: 4.2939
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.1010488003
Coefficient of variation (CV): 0.02135413039
Kurtosis: 1287.47917
Mean: 4.732049419
Median Absolute Deviation (MAD): 0
Skewness: -35.47057095
Sum: 1077146.945
Variance: 0.01021086004
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Common values

Value, Count, Frequency (%)
4.7343, 227101, 99.8%
5.1013, 325, 0.1%
1.0643, 167, 0.1%
4.4407, 18, < 0.1%
4.3673, 12, < 0.1%
4.0003, 2, < 0.1%
1.1744, 1, < 0.1%
0.8074, 1, < 0.1%
4.6976, 1, < 0.1%

Minimum 5 values

Value, Count, Frequency (%)
0.8074, 1, < 0.1%
1.0643, 167, 0.1%
1.1744, 1, < 0.1%
4.0003, 2, < 0.1%
4.3673, 12, < 0.1%

Maximum 5 values

Value, Count, Frequency (%)
5.1013, 325, 0.1%
4.7343, 227101, 99.8%
4.6976, 1, < 0.1%
4.4407, 18, < 0.1%
4.3673, 12, < 0.1%
 

discount_price
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.512188481
Minimum: 0.7707
Maximum: 5.0279
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 0.7707
5-th percentile: 4.5141
Q1: 4.5141
median: 4.5141
Q3: 4.5141
95-th percentile: 4.5141
Maximum: 5.0279
Range: 4.2572
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.09707790925
Coefficient of variation (CV): 0.02151459533
Kurtosis: 1231.974423
Mean: 4.512188481
Median Absolute Deviation (MAD): 0
Skewness: -34.15730579
Sum: 1027100.44
Variance: 0.009424120464
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Common values

Value, Count, Frequency (%)
4.5141, 227096, 99.8%
5.0279, 325, 0.1%
1.0276, 167, 0.1%
4.2205, 18, < 0.1%
4.0737, 12, < 0.1%
4.3673, 4, < 0.1%
4.4407, 2, < 0.1%
3.7801, 2, < 0.1%
1.1744, 1, < 0.1%
0.7707, 1, < 0.1%

Minimum 5 values

Value, Count, Frequency (%)
0.7707, 1, < 0.1%
1.0276, 167, 0.1%
1.1744, 1, < 0.1%
3.7801, 2, < 0.1%
4.0737, 12, < 0.1%

Maximum 5 values

Value, Count, Frequency (%)
5.0279, 325, 0.1%
4.5141, 227096, 99.8%
4.4407, 2, < 0.1%
4.3673, 4, < 0.1%
4.2205, 18, < 0.1%
 

account_creation_date
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 226023
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
2020-03-14 18:22:06, 3, < 0.1%
2019-07-01 01:21:10, 3, < 0.1%
2020-02-29 17:58:30, 3, < 0.1%
2019-06-30 14:47:57, 3, < 0.1%
2019-11-17 16:24:52, 3, < 0.1%
2019-07-02 14:58:45, 3, < 0.1%
2020-02-29 19:26:26, 3, < 0.1%
2019-11-30 02:48:10, 3, < 0.1%
2019-12-28 16:35:34, 3, < 0.1%
2020-03-15 20:09:50, 2, < 0.1%
Other values (226013), 227599, > 99.9%

Frequencies of value counts

Unique

Unique: 224427
Unique (%): 98.6%

Histogram of lengths of the category

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

trial_end_date
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 226023
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
2020-01-11 16:35:34, 3, < 0.1%
2019-07-14 14:47:57, 3, < 0.1%
2019-12-14 02:48:10, 3, < 0.1%
2019-07-16 14:58:45, 3, < 0.1%
2019-12-01 16:24:52, 3, < 0.1%
2020-03-14 19:26:26, 3, < 0.1%
2020-03-28 18:22:06, 3, < 0.1%
2019-07-15 01:21:10, 3, < 0.1%
2020-03-14 17:58:30, 3, < 0.1%
2019-12-10 00:24:38, 2, < 0.1%
Other values (226013), 227599, > 99.9%

Frequencies of value counts

Unique

Unique: 224427
Unique (%): 98.6%

Histogram of lengths of the category

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

initial_credit_card_declined
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 222.3 KiB

Value, Count, Frequency (%)
False, 216425, 95.1%
True, 11203, 4.9%
 

join_fee
Real number (ℝ)

MISSING
ZEROS

Distinct: 20
Distinct (%): < 0.1%
Missing: 34904
Missing (%): 15.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1151453836
Minimum: -0.6606
Maximum: 0.734
Zeros: 33482
Zeros (%): 14.7%
Memory size: 1.7 MiB

Quantile statistics

Minimum: -0.6606
5-th percentile: 0
Q1: 0.0367
median: 0.0367
Q3: 0.1101
95-th percentile: 0.6606
Maximum: 0.734
Range: 1.3946
Interquartile range (IQR): 0.0734

Descriptive statistics

Standard deviation: 0.1769706831
Coefficient of variation (CV): 1.536932507
Kurtosis: 3.131999623
Mean: 0.1151453836
Median Absolute Deviation (MAD): 0
Skewness: 2.026261329
Sum: 22191.2789
Variance: 0.03131862266
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=20)

Common values

Value, Count, Frequency (%)
0.0367, 109482, 48.1%
0, 33482, 14.7%
0.3303, 25314, 11.1%
0.6606, 12282, 5.4%
0.1101, 10648, 4.7%
0.367, 1333, 0.6%
0.1835, 146, 0.1%
-0.0367, 9, < 0.1%
0.6973, 8, < 0.1%
0.6239, 6, < 0.1%
Other values (10), 14, < 0.1%
(Missing), 34904, 15.3%

Minimum 5 values

Value, Count, Frequency (%)
-0.6606, 2, < 0.1%
-0.3303, 1, < 0.1%
-0.1101, 1, < 0.1%
-0.0367, 9, < 0.1%
0, 33482, 14.7%

Maximum 5 values

Value, Count, Frequency (%)
0.734, 1, < 0.1%
0.6973, 8, < 0.1%
0.6606, 12282, 5.4%
0.6239, 6, < 0.1%
0.5872, 1, < 0.1%
 

language
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
ar, 227628, 100.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

payment_type
Categorical

MISSING

Distinct: 6
Distinct (%): < 0.1%
Missing: 135578
Missing (%): 59.6%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
Standard Charter, 38810, 17.0%
Paypal, 30911, 13.6%
RAKBANK, 14831, 6.5%
CBD, 5080, 2.2%
Najim, 2414, 1.1%
Apple Pay, 4, < 0.1%
(Missing), 135578, 59.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 16
Median length: 3
Mean length: 5.90578927
Min length: 3

num_trial_days
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
14, 227458, 99.9%
0, 170, 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 2
Median length: 2
Mean length: 1.999253167
Min length: 1

current_sub_TF
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 222.3 KiB

Value, Count, Frequency (%)
True, 130250, 57.2%
False, 97378, 42.8%
 

payment_period
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
0, 95391, 41.9%
1, 86968, 38.2%
2, 42921, 18.9%
3, 2348, 1.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

last_payment
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 131339
Distinct (%): 99.3%
Missing: 95391
Missing (%): 41.9%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
2020-03-14 19:26:26, 3, < 0.1%
2020-03-08 19:00:43, 3, < 0.1%
2019-12-14 02:48:10, 3, < 0.1%
2020-02-29 12:16:18, 3, < 0.1%
2020-02-14 16:53:19, 2, < 0.1%
2020-01-22 14:57:28, 2, < 0.1%
2020-01-18 01:17:02, 2, < 0.1%
2020-03-25 05:48:58, 2, < 0.1%
2020-02-14 16:18:49, 2, < 0.1%
2020-01-16 11:50:04, 2, < 0.1%
Other values (131329), 132213, 58.1%
(Missing), 95391, 41.9%

Frequencies of value counts

Unique

Unique: 130445
Unique (%): 98.6%

Histogram of lengths of the category

Length

Max length: 19
Median length: 19
Mean length: 12.29495493
Min length: 3

next_payment
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 129206
Distinct (%): 99.2%
Missing: 97378
Missing (%): 42.8%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
2020-07-08 19:00:43, 3, < 0.1%
2020-04-14 02:48:10, 3, < 0.1%
2020-03-28 18:22:06, 3, < 0.1%
2020-07-14 19:26:26, 3, < 0.1%
2020-06-29 12:16:18, 3, < 0.1%
2020-05-24 21:19:34, 2, < 0.1%
2020-05-12 16:04:44, 2, < 0.1%
2020-06-24 23:09:50, 2, < 0.1%
2020-04-02 15:12:04, 2, < 0.1%
2020-04-13 19:44:04, 2, < 0.1%
Other values (129196), 130225, 57.2%
(Missing), 97378, 42.8%

Frequencies of value counts

Unique

Unique: 128167
Unique (%): 98.4%

Histogram of lengths of the category

Length

Max length: 19
Median length: 19
Mean length: 12.15528845
Min length: 3

cancel_date
Categorical

HIGH CARDINALITY
MISSING

Distinct: 284
Distinct (%): 0.8%
Missing: 190797
Missing (%): 83.8%
Memory size: 1.7 MiB

Value, Count, Frequency (%)
2019-07-13 00:00:00, 431, 0.2%
2019-07-12 00:00:00, 396, 0.2%
2019-07-14 00:00:00, 384, 0.2%
2019-07-15 00:00:00, 348, 0.2%
2019-07-11 00:00:00, 321, 0.1%
2019-07-16 00:00:00, 317, 0.1%
2019-07-17 00:00:00, 300, 0.1%
2019-07-10 00:00:00, 288, 0.1%
2019-07-09 00:00:00, 258, 0.1%
2019-07-18 00:00:00, 254, 0.1%
Other values (274), 33534, 14.7%
(Missing), 190797, 83.8%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 19
Median length: 3
Mean length: 5.588855501
Min length: 3

trial_completed
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 222.3 KiB

Value, Count, Frequency (%)
True, 200236, 88.0%
False, 27392, 12.0%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only its direction does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
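
The calculation described above can be sketched in plain Python. This is an illustrative implementation with invented toy data, not the code the report generator runs; population (divide-by-n) moments are used, which gives the same r as the sample versions because the factor cancels.

```python
import math

def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

# Hypothetical toy data: y is an exact increasing linear function of x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 5 for x in xs]

print(pearson_r(xs, ys))                    # close to +1
print(pearson_r(xs, [-y for y in ys]))      # close to -1
print(pearson_r([10 * x for x in xs], ys))  # rescaling x leaves r unchanged
```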

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
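
A minimal sketch of this rank-based calculation follows. It assumes ties receive the average of the ranks they span (a common convention, not confirmed by the report), and the toy data are invented to show a relationship that is monotonic but not linear.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical toy data: y = x**3 is monotonic but not linear in x.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this toy input ρ reaches 1 while r stays below 1, which is the difference the paragraph above describes.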

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
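
The pair-counting rule above corresponds to the variant usually called tau-a. The sketch below implements exactly that rule on invented toy data; statistical libraries commonly default to a tie-adjusted variant (tau-b), so results on data with many ties may differ from this version.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
        # direction == 0 is a tie and counts toward neither group
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical toy data.
print(kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4]))  # all pairs concordant
print(kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1]))  # all pairs discordant
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))        # two concordant, one discordant
```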

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for it.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
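
A sketch of a bias-corrected Cramér's V, written from the correction as it is usually stated for Bergsma's 2013 proposal: φ² = χ²/n is reduced by (k-1)(r-1)/(n-1) and floored at zero, and the table dimensions r and k are shrunk by (r-1)²/(n-1) and (k-1)²/(n-1). This is an assumption about the formula, not a copy of the report generator's code, and the category labels are invented.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals, col_totals = Counter(xs), Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Hypothetical toy labels: first pair perfectly associated, second independent.
perfect = cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(perfect, independent)
```

A constant column makes the denominator non-positive, so the sketch returns 0.0 in that case rather than dividing by zero.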

Missing values

Sample

First rows

(index), subid, package_type, num_weekly_services_utilized, preferred_genre, intended_use, weekly_consumption_hour, num_ideal_streaming_services, age, male_TF, country, attribution_technical, attribution_survey, op_sys, months_per_bill_period, plan_type, monthly_price, discount_price, account_creation_date, trial_end_date, initial_credit_card_declined, join_fee, language, payment_type, num_trial_days, current_sub_TF, payment_period, last_payment, next_payment, cancel_date, trial_completed
0, 21724479, economy, NaN, comedy, access to exclusive content, NaN, NaN, NaN, False, UAE, facebook, facebook, Android, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2020-01-24 21:44:16, 2020-02-07 21:44:16, False, NaN, ar, Standard Charter, 14, True, 1, 2020-02-07 21:44:16, 2020-06-07 21:44:16, NaN, True
1, 23383224, base, NaN, comedy, access to exclusive content, 22.951579, NaN, 70.0, False, UAE, facebook, facebook, NaN, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2020-03-01 15:44:35, 2020-03-15 15:44:35, False, 0.3303, ar, NaN, 14, True, 1, 2020-03-15 15:44:35, 2020-07-15 15:44:35, NaN, True
2, 26844789, enhanced, 3.0, regional, replace OTT, 36.001186, 2.0, 25.0, True, UAE, organic, facebook, iOS, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2019-12-07 16:37:06, 2019-12-21 16:37:06, False, 0.1101, ar, NaN, 14, False, 0, NaN, NaN, NaN, True
3, 29417030, base, NaN, drama, replace OTT, 20.051667, NaN, 30.0, False, UAE, search, tv, Android, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2020-01-27 16:09:32, 2020-02-10 16:09:32, False, 0.0367, ar, NaN, 14, False, 0, NaN, NaN, NaN, True
4, 26723159, base, 4.0, comedy, replace OTT, 22.951579, 3.0, 28.0, False, UAE, discovery, youtube, iOS, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2019-10-05 12:57:07, 2019-10-19 12:57:07, False, 0.0367, ar, NaN, 14, True, 2, 2020-02-19 12:57:07, 2020-06-19 12:57:07, NaN, True
5, 24810928, base, NaN, comedy, access to exclusive content, 20.051667, NaN, 70.0, False, UAE, bing, tv, NaN, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2020-03-03 20:15:43, 2020-03-17 20:15:43, False, 0.3303, ar, RAKBANK, 14, True, 1, 2020-03-17 20:15:43, 2020-07-17 20:15:43, NaN, True
6, 29726122, base, 2.0, comedy, access to exclusive content, 20.051667, 2.0, 61.0, False, UAE, bing, search, Android, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2020-02-19 18:30:15, 2020-03-04 18:30:15, False, 0.3303, ar, Standard Charter, 14, True, 1, 2020-03-04 18:30:15, 2020-07-04 18:30:15, NaN, True
7, 20299962, base, 3.0, drama, access to exclusive content, 34.551230, 2.0, 23.0, False, UAE, email, referral, iOS, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2020-03-05 14:52:22, 2020-03-19 14:52:22, False, 0.0000, ar, RAKBANK, 14, True, 1, 2020-03-19 14:52:22, 2020-07-19 14:52:22, NaN, True
8, 24930568, base, NaN, comedy, access to exclusive content, 25.851492, NaN, 73.0, False, UAE, facebook, facebook, iOS, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2020-02-23 17:50:25, 2020-03-08 17:50:25, False, 0.6606, ar, NaN, 14, True, 1, 2020-03-08 17:50:25, 2020-07-08 17:50:25, NaN, True
9, 23452753, economy, 3.0, drama, replace OTT, 28.751405, 2.0, 71.0, False, UAE, search, facebook, Android, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2020-01-21 14:17:53, 2020-02-04 14:17:53, False, 0.3303, ar, NaN, 14, False, 0, NaN, NaN, 2020-01-27 00:00:00, False

Last rows

(index), subid, package_type, num_weekly_services_utilized, preferred_genre, intended_use, weekly_consumption_hour, num_ideal_streaming_services, age, male_TF, country, attribution_technical, attribution_survey, op_sys, months_per_bill_period, plan_type, monthly_price, discount_price, account_creation_date, trial_end_date, initial_credit_card_declined, join_fee, language, payment_type, num_trial_days, current_sub_TF, payment_period, last_payment, next_payment, cancel_date, trial_completed
227618, 22218943, economy, NaN, comedy, replace OTT, 37.451142, NaN, 67.0, True, UAE, brand sem intent bing, referral, iOS, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2019-11-16 02:53:50, 2019-11-30 02:53:50, False, 0.0367, ar, NaN, 14, False, 0, NaN, NaN, 2019-11-27 00:00:00, False
227619, 25492551, base, 3.0, comedy, access to exclusive content, 30.201361, 2.0, 32.0, False, UAE, email, facebook, Android, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2019-09-30 22:07:37, 2019-10-14 22:07:37, False, 0.0000, ar, NaN, 14, True, 2, 2020-02-14 22:07:37, 2020-06-14 22:07:37, NaN, True
227620, 21928274, base, NaN, drama, replace OTT, NaN, NaN, NaN, False, UAE, facebook, facebook, Android, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2020-01-14 02:04:52, 2020-01-28 02:04:52, False, NaN, ar, RAKBANK, 14, False, 0, NaN, NaN, NaN, True
227621, 25549852, enhanced, NaN, comedy, access to exclusive content, 28.751405, NaN, 61.0, False, UAE, affiliate, facebook, Android, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2020-03-06 02:57:03, 2020-03-20 02:57:03, False, 0.3303, ar, NaN, 14, True, 1, 2020-03-20 02:57:03, 2020-07-20 02:57:03, NaN, True
227622, 25835684, base, 2.0, drama, access to exclusive content, 24.401536, 2.0, 43.0, False, UAE, email, pinterest, iOS, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2020-01-01 22:43:56, 2020-01-15 22:43:56, False, 0.0000, ar, NaN, 14, True, 1, 2020-01-15 22:43:56, 2020-05-15 22:43:56, NaN, True
227623, 21434712, enhanced, 3.0, comedy, supplement OTT, 28.751405, 2.0, 38.0, False, UAE, facebook, facebook_organic, iOS, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2019-11-17 14:12:33, 2019-12-01 14:12:33, False, 0.3303, ar, NaN, 14, True, 1, 2019-12-01 14:12:33, 2020-04-01 14:12:33, NaN, True
227624, 25843074, enhanced, 2.0, comedy, replace OTT, 27.301448, 2.0, 49.0, False, UAE, google_organic, referral, iOS, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2019-12-06 18:02:13, 2019-12-20 18:02:13, False, 0.3303, ar, Paypal, 14, True, 1, 2019-12-20 18:02:13, 2020-04-20 18:02:13, NaN, True
227625, 24799085, base, NaN, comedy, access to exclusive content, 31.651317, NaN, 45.0, False, UAE, facebook, facebook, iOS, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2019-12-21 19:40:44, 2020-01-04 19:40:44, True, 0.0367, ar, NaN, 14, True, 1, 2020-01-04 19:40:44, 2020-05-04 19:40:44, NaN, True
227626, 21308040, base, NaN, comedy, access to exclusive content, NaN, NaN, NaN, False, UAE, facebook, facebook, iOS, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2020-01-17 23:58:51, 2020-01-31 23:58:51, False, NaN, ar, Paypal, 14, True, 1, 2020-01-31 23:58:51, 2020-05-31 23:58:51, NaN, True
227627, 20166335, base, NaN, comedy, replace OTT, 25.851492, NaN, 55.0, False, UAE, organic, tv, iOS, 4, base_uae_14_day_trial, 4.7343, 4.5141, 2019-11-26 19:09:09, 2019-12-10 19:09:09, False, 0.0367, ar, NaN, 14, False, 0, NaN, NaN, 2019-12-09 00:00:00, False